Error Handling & Fault-Tolerant Mindset (Backend)

Core Philosophy

  • Errors are:

    • Inevitable in backend systems
  • Goal is NOT to avoid errors, but to:

    • Detect early
    • Handle gracefully
    • Recover reliably

Key Principle

The best error handling starts BEFORE the error happens.

Types of Errors

1. Logic Errors

  • Code runs successfully but:

    • Produces incorrect results

Example

  • Discount applied twice → financial loss

Why They Happen

  • Misunderstood requirements
  • Incorrect algorithm implementation
  • Missing edge cases

Risk

  • Silent failures
  • Can go unnoticed for weeks/months
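
One possible guard against the double-discount example is to make the operation idempotent, so a repeated call changes nothing. This is a minimal sketch; the `Order` model, its fields, and the code `WELCOME10` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Order:
    """Hypothetical order model, used only for illustration."""

    subtotal: Decimal
    applied_codes: set[str] = field(default_factory=set)
    discount_total: Decimal = Decimal("0")

    def apply_discount(self, code: str, amount: Decimal) -> None:
        # Applying the same code again is a no-op, so a duplicated call
        # cannot silently reduce the price twice.
        if code in self.applied_codes:
            return
        self.applied_codes.add(code)
        self.discount_total += amount

    @property
    def total(self) -> Decimal:
        # Never let discounts push the total below zero.
        return max(self.subtotal - self.discount_total, Decimal("0"))


order = Order(subtotal=Decimal("100.00"))
order.apply_discount("WELCOME10", Decimal("10.00"))
order.apply_discount("WELCOME10", Decimal("10.00"))  # accidental second call
```

A unit test asserting that `order.total` is still `90.00` after the second call is the kind of check that turns a silent logic error into a visible one.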

2. Database Errors

a. Connection Errors

  • Cannot connect to DB

  • Causes:

    • Network issues
    • DB overload
    • Connection pool exhaustion

b. Constraint Violations

  • Violating DB rules

Examples

  • Unique constraint:

    • Duplicate email
  • Foreign key:

    • Referencing non-existent record

Root Cause

  • Usually a weak validation layer: the bad value should have been caught before it reached the DB
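
Even with validation in place, the constraint itself can still fire, so it helps to translate the driver error into a domain error. The sketch below uses Python's built-in `sqlite3` as a stand-in database; the `users` table, `create_user`, and `DuplicateEmailError` are assumptions for illustration.

```python
import sqlite3


class DuplicateEmailError(Exception):
    """Hypothetical domain error: the email is already registered."""


def create_user(conn: sqlite3.Connection, email: str) -> None:
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    except sqlite3.IntegrityError as exc:
        # Translate the driver-level error into a domain error so upper
        # layers do not depend on database details. In this sketch the
        # UNIQUE constraint is the only one a non-null email can violate.
        raise DuplicateEmailError("email already registered") from exc


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)"
)
create_user(conn, "a@example.com")
```

Calling `create_user(conn, "a@example.com")` a second time raises `DuplicateEmailError` instead of leaking `sqlite3.IntegrityError` to the caller.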

c. Query Errors

  • Malformed or inefficient SQL

Examples

  • Wrong table name
  • Syntax errors
  • Complex queries timing out

d. Deadlocks

  • Two or more transactions each wait for a lock the other holds (a circular wait)

3. External Service Errors

  • Dependencies:

    • Email providers
    • Payment gateways
    • Auth providers
    • Cloud services

Common Issues

a. Network Failures

  • Timeouts
  • DNS failures
  • Network partitions

b. Authentication Errors

  • Invalid credentials
  • Expired tokens
  • Permission issues

c. Rate Limiting

  • Too many requests → HTTP 429

Strategy

  • Exponential Backoff

    • Retry with increasing delay
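
A minimal sketch of retry with exponential backoff. `RateLimitedError` is a hypothetical exception that a client wrapper would raise on HTTP 429; the default delays, the attempt limit, and the added jitter are assumptions, not values from any particular provider.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitedError(Exception):
    """Hypothetical error a client wrapper raises on HTTP 429."""


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide
            # Delay doubles each attempt (base, 2x, 4x, ...) up to a cap.
            delay = min(base_delay * (2 ** attempt), max_delay)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the function can be tested without real waiting. Only the rate-limit error is retried here; retrying every exception would also retry bugs.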

d. Service Outages

  • External service goes down

Solution

  • Fallback systems
  • Graceful degradation
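
A rough sketch of a fallback with graceful degradation: when the external call fails, serve the last known good data and flag the response as degraded. The function name, the recommendations use case, and the in-memory cache are all hypothetical.

```python
from typing import Callable


def get_recommendations(
    user_id: str,
    fetch_remote: Callable[[str], list[str]],
    cached: dict[str, list[str]],
) -> tuple[list[str], bool]:
    """Return (items, degraded). All names here are hypothetical."""
    try:
        items = fetch_remote(user_id)
        cached[user_id] = items  # refresh the fallback copy
        return items, False
    except (ConnectionError, TimeoutError):
        # Serve stale data (or nothing) rather than fail the whole request.
        return cached.get(user_id, []), True
```

The `degraded` flag lets the caller hide or label the affected feature while the rest of the page keeps working.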

4. Input Validation Errors

  • Caused by bad user input

Types

  • Format validation

    • Email, phone, date
  • Range validation

    • Length, numeric limits
  • Required fields

Response

  • Return 400 Bad Request
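
The three validation types can be sketched as below. The signup payload, the field names, the age limits, and the response shape are assumptions; the email pattern is deliberately simple and is not a full address validator.

```python
import re
from typing import Any

# Deliberately simple format check: something@something.something
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_signup(payload: dict[str, Any]) -> dict[str, str]:
    """Return a field -> message map; an empty map means the input is valid."""
    errors: dict[str, str] = {}

    email = payload.get("email")
    if not isinstance(email, str) or not email:
        errors["email"] = "required"  # required-field check
    elif not EMAIL_RE.match(email):
        errors["email"] = "invalid format"  # format check

    age = payload.get("age")
    if age is None:
        errors["age"] = "required"
    elif isinstance(age, bool) or not isinstance(age, int) or not 13 <= age <= 120:
        errors["age"] = "must be an integer between 13 and 120"  # range check

    return errors


def handle_signup(payload: dict[str, Any]) -> tuple[int, dict]:
    errors = validate_signup(payload)
    if errors:
        return 400, {"error": "validation_failed", "fields": errors}
    return 201, {"status": "created"}
```

Collecting all field errors before responding lets the client fix everything in one round trip.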

5. Configuration Errors

  • Missing/incorrect environment variables

When They Occur

  • Moving between:

    • Dev → Staging → Production

Best Practice

  • Validate configs at startup
  • Fail fast if missing
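
A minimal sketch of startup validation. The variable names `DATABASE_URL` and `PAYMENT_API_KEY` are placeholders; a real service would list its own.

```python
import os
from typing import Mapping

REQUIRED_VARS = ("DATABASE_URL", "PAYMENT_API_KEY")  # hypothetical names


class ConfigError(RuntimeError):
    """Raised at startup when required configuration is absent."""


def load_config(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    # Treat unset and empty values the same way: both are unusable.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # Report names only, never values, so secrets stay out of logs.
        raise ConfigError(
            "missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_config()` once, before the server starts accepting traffic, makes a misconfigured deployment fail immediately instead of on the first request that needs the missing value.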

Prevention Strategies

1. Proactive Error Detection

  • Detect issues before they cause damage

Health Checks

Basic

  • /health endpoint → returns 200

Advanced

  • DB connectivity checks
  • Query performance checks
  • External service checks
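
An advanced health check might look roughly like this. It uses the built-in `sqlite3` module as a stand-in database; the 100 ms threshold, the "degraded" state, and returning 503 when a dependency is down are assumptions rather than a fixed standard.

```python
import sqlite3
import time
from typing import Callable


def check_database(conn: sqlite3.Connection, slow_ms: float = 100.0) -> dict:
    """Connectivity plus a crude query-latency check."""
    started = time.perf_counter()
    try:
        conn.execute("SELECT 1").fetchone()
    except sqlite3.Error:
        return {"status": "down"}
    elapsed_ms = (time.perf_counter() - started) * 1000
    status = "degraded" if elapsed_ms > slow_ms else "up"
    return {"status": status, "latency_ms": round(elapsed_ms, 2)}


def health(checks: dict[str, Callable[[], dict]]) -> tuple[int, dict]:
    """Run every check and return (HTTP status, body) for a /health endpoint."""
    results = {name: check() for name, check in checks.items()}
    healthy = all(result["status"] != "down" for result in results.values())
    return (200 if healthy else 503), {"checks": results}
```

External service checks can be added as further entries in the `checks` dictionary, each returning the same `{"status": ...}` shape.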

2. Monitoring & Observability

  • Detect errors in real time

Track

  • Error rates
  • Response times
  • Resource usage
  • Throughput

Business Metrics

  • Successful transactions
  • Authentication success rate

Logging

  • Use structured logs (JSON)

  • Include:

    • Metadata
    • Context
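
Structured JSON logs can be produced with the standard `logging` module alone. In this sketch the `context` key passed through `extra=` is a hypothetical convention, as are the logger name and the field names.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # "context" is a hypothetical attribute supplied via `extra=`.
        entry.update(getattr(record, "context", {}))
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        # default=str keeps non-JSON values (e.g. Decimal) from crashing logging.
        return json.dumps(entry, default=str)


logger = logging.getLogger("orders")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info(
    "payment failed",
    extra={"context": {"order_id": "o-123", "correlation_id": "c-456"}},
)
```

Because every line is a JSON object, log tooling can filter on fields such as `correlation_id` instead of parsing free text.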

Error Handling Philosophy

1. Immediate Error Response

Recoverable Errors

  • Use:

    • Retry
    • Exponential backoff

Non-Recoverable Errors

  • Use:

    • Graceful degradation
    • Fallback systems
    • Disable non-critical features
    • Containment

2. Error Recovery Strategies

Automatic

  • Restart services
  • Clear corrupted cache
  • Switch to backup

Manual

  • Requires human intervention

  • Must:

    • Be documented
    • Be tested

Data Protection

  • Ensure:

    • Backups
    • Transaction logs
    • Restore mechanisms

3. Error Propagation

  • Bubble errors up with context

Mechanism

  • try-catch / exceptions

Goal

  • Add context at each layer

  • Do not return raw exception messages or verbose internal error details to the client, as they can give hints to attackers; keep the detail in server-side logs.
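
Adding context at each layer can be sketched with Python's exception chaining (`raise ... from ...`). The layer names, exception classes, and the simulated timeout are hypothetical.

```python
class RepositoryError(Exception):
    """Hypothetical error raised by the data-access layer."""


class OrderServiceError(Exception):
    """Hypothetical error raised by the service layer."""


def find_order_row(order_id: str) -> dict:
    try:
        # Stand-in for a real driver failure.
        raise TimeoutError("query timed out")
    except TimeoutError as exc:
        # Repository layer adds what it knows: which record was requested.
        raise RepositoryError(f"failed to load order {order_id}") from exc


def get_order(order_id: str, user_id: str) -> dict:
    try:
        return find_order_row(order_id)
    except RepositoryError as exc:
        # Service layer adds what it knows: on whose behalf it was acting.
        raise OrderServiceError(f"get_order failed for user {user_id}") from exc
```

The original error stays reachable through `__cause__`, so the server-side log shows the full chain while the client only ever sees what the top layer chooses to return.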


4. Isolation (Fault Containment)

  • Prevent error spread

Techniques

  • Separate processes
  • Timeouts
  • Message queues
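
Of the techniques above, a timeout is the simplest to sketch. This version runs the call on a worker thread and stops waiting after a deadline; the helper name, pool size, and fallback value are assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, TypeVar

T = TypeVar("T")

# Shared pool; its size also caps how many slow calls can pile up.
_executor = ThreadPoolExecutor(max_workers=4)


def call_with_timeout(fn: Callable[[], T], timeout_s: float, fallback: T) -> T:
    future = _executor.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # Python cannot kill the worker thread: we only stop waiting for it.
        # cancel() helps only if the task has not started yet.
        future.cancel()
        return fallback
```

Because the abandoned call keeps running in the background, the slow dependency should also have its own network-level timeout; a separate process or a message queue gives stronger isolation than a thread.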

Global Error Handling (Final Safety Net)

Usually implemented in middleware.

Architecture Flow

  • Route → Handler → Service → Repository

Strategy

  • Errors:

    • Thrown from any layer
    • Bubble up to global handler

Responsibilities

  • Identify error type
  • Map to correct HTTP response
  • Send user-friendly message

Examples

Validation Error

  • Response:

    • 400 Bad Request

Unique Constraint Error

  • Response:

    • 400 Bad Request (409 Conflict is a common alternative)
    • Message: "Resource already exists"

Not Found Error

  • DB returns no rows

  • Response:

    • 404 Not Found

Foreign Key Error

  • Invalid reference

  • Response:

    • 404
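
The mapping above can be sketched as a single catch-all wrapper. The exception classes and `handle_request` are hypothetical and not tied to any framework; the status codes follow the examples in these notes.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("api")


class ValidationError(Exception): ...
class UniqueConstraintError(Exception): ...
class NotFoundError(Exception): ...
class ForeignKeyError(Exception): ...


# Error type -> (HTTP status, safe client-facing message).
ERROR_MAP: dict[type, tuple[int, str]] = {
    ValidationError: (400, "Invalid request"),
    UniqueConstraintError: (400, "Resource already exists"),
    NotFoundError: (404, "Resource not found"),
    ForeignKeyError: (404, "Referenced resource not found"),
}


def handle_request(handler: Callable[[Any], Any], request: Any) -> tuple[int, Any]:
    """Global safety net: every error from any layer ends up here."""
    try:
        return 200, handler(request)
    except Exception as exc:
        for exc_type, (status, message) in ERROR_MAP.items():
            if isinstance(exc, exc_type):
                return status, {"error": message}
        # Unknown error: log the details server-side, reveal nothing to the client.
        logger.exception("unhandled error")
        return 500, {"error": "Something went wrong"}
```

In a web framework this logic would normally live in an error-handling middleware rather than a wrapper function, but the classification step is the same.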

Benefits

1. Robustness

  • No missed error cases
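  • Any error that no specific case handles is still caught by the global handler and returned as a generic '500 Internal Server Error'. Even when we don't know what happened, the client gets a controlled response. This is the benefit of having a global error handling (catch-all) layer.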

2. Reduced Redundancy

  • Centralized logic
  • Avoid duplication

Security in Error Handling

1. Avoid Information Leakage

Never expose:

  • Table names
  • Internal errors
  • Stack traces

Use Generic Messages

  • Example:

    • "Something went wrong"

2. Authentication Errors

Bad Practice

  • "User does not exist"
  • "Password incorrect"

Good Practice

  • "Invalid email or password"

Why?

  • Prevents:

    • User enumeration attacks
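
A sketch of a login check that returns the same message for an unknown user and a wrong password. The in-memory `USERS` store, the 401 status, and the PBKDF2 parameters are assumptions for illustration; hashing a dummy password for unknown emails is an extra precaution (not from the notes above) so response time does not reveal which case occurred.

```python
import hashlib
import hmac
import os

GENERIC_ERROR = "Invalid email or password"


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical in-memory user store: email -> (salt, password hash).
_salt = os.urandom(16)
USERS = {"a@example.com": (_salt, hash_password("correct horse", _salt))}

# Dummy credentials so unknown emails cost the same work as known ones.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = hash_password("dummy", _DUMMY_SALT)


def login(email: str, password: str) -> tuple[int, dict]:
    known = email in USERS
    salt, stored = USERS.get(email, (_DUMMY_SALT, _DUMMY_HASH))
    candidate = hash_password(password, salt)
    # Constant-time comparison; `known` guards against matching the dummy hash.
    if not (hmac.compare_digest(candidate, stored) and known):
        return 401, {"error": GENERIC_ERROR}
    return 200, {"status": "ok"}
```

Both failure paths produce an identical response, so an attacker cannot use the login endpoint to learn which emails are registered.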

3. Logging Best Practices

Do NOT log:

  • Passwords
  • API keys
  • Credit card details

Log Instead:

  • User ID
  • Correlation ID
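
One way to enforce this is to redact sensitive fields before an event reaches the logger. The key names in `SENSITIVE_KEYS` are assumptions; a real list would match the application's own field names.

```python
from typing import Any

# Hypothetical field names that must never reach the logs.
SENSITIVE_KEYS = {"password", "api_key", "card_number", "cvv"}


def redact(data: Any) -> Any:
    """Return a copy of `data` with sensitive values masked, at any depth."""
    if isinstance(data, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else redact(value)
            for key, value in data.items()
        }
    if isinstance(data, list):
        return [redact(item) for item in data]
    return data
```

The function returns a new structure and leaves the original untouched, so the request itself can still be processed after its loggable copy has been masked.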

Best Practices Summary

Design Principles

  • Expect failures
  • Design for recovery
  • Fail fast on config errors
  • Validate inputs strictly

System Design

  • Use:

    • Health checks
    • Monitoring
    • Logging
    • Retry mechanisms
    • Fallback systems

Error Handling

  • Centralize (global handler)
  • Classify errors properly
  • Return meaningful but safe messages

Final Takeaway

Robust backend systems are not built by avoiding errors, 
but by expecting, handling, and recovering from them intelligently.

Very Important Reading

  • https://owasp.org/www-project-top-ten/
  • https://owasp.org/www-project-cheat-sheets/